Abstract

Digital compression of video images is a pos- 
sible avenue for HDTV transmission. Com- 
pression needs to be optimized while picture 
quality remains high. Two techniques for com- 
pressing digitized images are explained, and 
comparisons are drawn between the human 
vision system and artificial compression tech- 
niques. Suggestions for improving compres- 
sion algorithms through the use of neural and 
analog circuitry are given. 

I Introduction 

Digital compression algorithms are being ex- 
amined closely for possible use in High Defi- 
nition Television (HDTV) transmissions. For 
digital transmission of HDTV pictures to be- 
come a reality, broadcast quality video needs 
to be compressed in order to remain within 
typical bandwidths allotted for TV. Figure 1
shows a block diagram of a general transmission
of digital information.

Figure 1: Typical HDTV Transmission

The compression can be done in either a lossy
or a lossless manner. Lossy compression
approximates a continuum of values with a
discrete set of points, each of which represents a
range of values. This approximation, or
quantization, causes some loss of information.
Lossless compression involves no loss of
information; it makes use of statistical properties of
the data being transmitted. A Huffman code 
is a good illustration of lossless compression: 
symbols having higher a priori probabilities
are assigned shorter code words, thus
minimizing the average code length.
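
As a minimal sketch of the Huffman idea described above, the following builds a prefix code from a table of symbol probabilities. The four symbols and their probabilities are hypothetical, and the function name huffman_code is an illustrative choice, not something taken from the references.

```python
import heapq
from itertools import count


def huffman_code(probabilities):
    """Return {symbol: bitstring} for a dict of symbol probabilities."""
    tiebreak = count()  # unique counter so heap entries never compare dicts
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {sym: "0" for sym in probabilities}
    while len(heap) > 1:
        # Merge the two least probable subtrees, prefixing their codes.
        p0, _, codes0 = heapq.heappop(heap)
        p1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


# Hypothetical source: more probable symbols should get shorter code words.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(probs)
average_length = sum(probs[s] * len(code[s]) for s in probs)
```

For this particular distribution the average code length equals the source entropy (1.75 bits per symbol), compared with 2 bits for a fixed-length code.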

II Vector Quantization 

One popular form of digital compression is vector
quantization (VQ). VQ is a lossy algorithm
that involves dividing the screen into regular 
blocks of pixels, e.g. 2 pixels on a side. If each 
pixel is quantized into an 8-bit word, then the
4-pixel block, or tile, can be represented as a
32-bit vector. This vector can be used as a search
key in an associative memory lookup [1] to find 
the closest match amongst a predetermined
“codebook” of vectors.

DIGITAL COMPRESSION ALGORITHMS FOR HDTV TRANSMISSION (Ohio State Univ.) 7 p  N92-28673  Unclas

The codebook is a set
of relatively few vectors representative of all 
possible tiles. Since the encoder and decoder 
both have a copy of the codebook, only the 
index of the chosen code vector is sent. Thus, 
if the codebook contained 64 codewords, then
only 6 bits would be needed to specify the
representative vector, giving a compression ratio
of 32:6 = 5.3 and a rate of 6 bits/4 pixels
= 1.5 bpp.
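
The lookup and the bit budget above can be sketched as follows. The squared Euclidean distance used as the matching criterion and the four-entry codebook are assumptions made for illustration; the text specifies only that the closest match is found by an associative memory lookup.

```python
import math


def nearest_codeword(tile, codebook):
    """Index of the codebook vector closest to `tile` (squared Euclidean distance)."""
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: distance(tile, codebook[i]))


# Hypothetical 4-entry codebook of 2x2 tiles (the example in the text uses 64 entries).
codebook = [
    (0, 0, 0, 0),
    (255, 255, 255, 255),
    (0, 255, 0, 255),
    (0, 0, 255, 255),
]
tile = (10, 240, 5, 250)
index = nearest_codeword(tile, codebook)   # only this index is transmitted
reconstructed = codebook[index]            # the decoder looks the tile up again

# Bit budget for the 64-codeword example in the text.
bits_per_tile = 4 * 8                      # four 8-bit pixels
index_bits = math.ceil(math.log2(64))      # 6 bits select one of 64 codewords
compression_ratio = bits_per_tile / index_bits
bits_per_pixel = index_bits / 4
```

The reconstructed tile differs from the input tile, which is the loss that VQ accepts in exchange for sending a 6-bit index instead of 32 bits.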

Proper codebook generation is a significant 
factor in determining VQ performance [2]. In 
principle, the codebook could be updated to 
match the image it was representing; however, 
in practice synchronizing the encoder to the 
decoder too frequently would negate the bandwidth
reduction of VQ compression. The manner
in which the algorithm handles edge features
is also significant. The human eye places
a heavy emphasis on edge detection, so it is
important that image reconstruction near edge
features be sharp and consistent.

Generally, codebooks are generated using 
the most frequently repeated tiles in an image. 
This method of codebook generation performs 
well on average, but features that appear infre- 
quently within the image are not handled well, 
since they are not represented in the codebook. 
Unfortunately, edge features may often fall 
into this category. The larger the codebook 
can become, the more accurately it is able 
to describe the image to be compressed; but 
at the same time, the vector matching hard- 
ware becomes more complex, and the num- 
ber of bits sent per codebook index needs to 
be increased. The ability to trade off picture 
quality for compression performance may be
desirable in some applications, but not
necessarily HDTV, where both properties need to be
optimized.
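
A minimal sketch of frequency-based codebook generation, assuming non-overlapping 2x2 tiles and a toy image, shows why an infrequent edge tile can be left out of the codebook. The function names and the image are hypothetical.

```python
from collections import Counter


def tiles_2x2(image):
    """Yield each non-overlapping 2x2 tile of `image` as a 4-tuple."""
    for r in range(0, len(image) - 1, 2):
        for c in range(0, len(image[0]) - 1, 2):
            yield (image[r][c], image[r][c + 1],
                   image[r + 1][c], image[r + 1][c + 1])


def frequency_codebook(image, size):
    """Keep the `size` most frequently repeated tiles as the codebook."""
    return [tile for tile, _ in Counter(tiles_2x2(image)).most_common(size)]


# Toy image: two flat regions separated by a vertical edge in columns 4-5.
image = [[0, 0, 0, 0, 0, 9, 9, 9, 9, 9] for _ in range(4)]
codebook = frequency_codebook(image, size=2)
edge_tile = (0, 9, 0, 9)
edge_is_represented = edge_tile in codebook
```

With room for only two codewords, both flat tiles are kept and the edge tile is not, so the edge would have to be approximated by a flat tile at the decoder.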


Figure 2: Enhanced DPCM CODEC 

III Enhanced DPCM 

The enhanced DPCM video data compres- 
sion CODEC [3] utilizes four techniques to 
achieve data compression: differential pulse
code modulation (DPCM), non-adaptive pre- 
diction, non-linear quantization, and multi- 
level Huffman coding. The compression al- 
gorithm incorporates both lossy and lossless 
mechanisms in these four techniques. The en- 
coder portion of the CODEC performs the 
data compression, and the decoder recon- 
structs the image from the compressed image 
data. A block diagram of the digital portion 
of the CODEC is shown in Figure 2. The in- 
put to the encoder is a digitized NTSC video 
signal, sampled at four times the color subcar- 
rier frequency (nominally 14.32 MHz) with a 
resolution of 8 bits per sample (pixel). 

DPCM and non-adaptive prediction are pre- 
dictive techniques used to reduce the amount 
of redundancy in the video image. DPCM 
predicts the value of the current pixel based 
on the average of two neighboring pixels of 
the same color subcarrier phase: the 4th previous
pixel on the same line and the same pixel
from two lines previous. Redundancy is re- 
moved by subtracting the DPCM predicted 
value (PV in Figure 2) from the current pixel
value (PIX) resulting in a difference value. 
The non-adaptive predictor attempts to remove
more redundancy by predicting the difference
value that results from the DPCM prediction.
Because the difference values of neighboring
pixels are similar, the non-adaptive predictor
values are based on the difference value of the
previous pixel; they were determined by
statistical analysis of a number of sample
images with widely varying picture content. The
predicted value (NAP), which is contained in a
look-up table (see Figure 3) in the hardware, is
subtracted from the DPCM difference value,
resulting in a new difference value (DIF).

Figure 3: Look-up Table Values
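
The two prediction stages can be sketched as below. This is an illustration under stated assumptions, not the CODEC of [3]: the look-up table values of Figure 3 are not reproduced here, so nap_lookup is a hypothetical stand-in, the start-up value of 128 is assumed, and the prediction is computed from the original pixels, whereas a practical encoder would predict from reconstructed values so that encoder and decoder stay in step.

```python
def dpcm_predict(frame, row, col, default=128):
    """PV: mean of the 4th previous pixel on this line and the pixel two lines up."""
    left = frame[row][col - 4] if col >= 4 else default
    above = frame[row - 2][col] if row >= 2 else default
    return (left + above) // 2


def nap_lookup(prev_difference):
    """Hypothetical stand-in for the Figure 3 look-up table."""
    return prev_difference // 2


def difference_values(frame):
    """DIF for every pixel: (PIX - PV) - NAP, scanning each line left to right."""
    out = []
    for r, line in enumerate(frame):
        prev_difference = 0  # assumed start-up value at the beginning of a line
        row_out = []
        for c, pix in enumerate(line):
            difference = pix - dpcm_predict(frame, r, c)
            row_out.append(difference - nap_lookup(prev_difference))
            prev_difference = difference
        out.append(row_out)
    return out


flat = [[128] * 6 for _ in range(3)]
dif_flat = difference_values(flat)      # a perfectly predicted image

darker = [[100] * 6 for _ in range(3)]
dif_darker = difference_values(darker)  # start-up mismatch shrinks after one pixel
```

In the second example the first pixel of a line gives DIF = -28, and the NAP stage halves the next value to -14, illustrating how the second predictor removes part of what the first one leaves behind.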

The resultant difference value (DIF) is then 
quantized into one of 13 quantization levels 
(QL). Each quantization level (QL) has an as- 
sociated quantization value (QV) which is rep- 
resentative of the difference value (DIF). The 
difference values are quantized non-uniformly
to allow more resolution for the quantization 
levels with small difference values, and less res- 
olution for large difference values. This non- 
uniform distribution is chosen because the hu- 
man eye is more sensitive to small variations 
in smooth regions of an image (where the dif- 
ference values will be small) than it is to large 
variations at transition boundaries (large dif- 
ference values). Like the NAP values, the 
quantization values were determined by sta- 
tistical analysis on sample images and the QL 
and QV values are contained in look-up tables
in the encoder hardware memory. 
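
A non-uniform quantizer with 13 levels can be sketched as follows. The quantization values below are hypothetical (the values actually used were derived statistically and are not listed in this paper), and placing the decision thresholds halfway between neighboring values is an assumption.

```python
from bisect import bisect_right

# Hypothetical quantization values (QV) for 13 levels, spaced densely near zero.
QV = [-96, -56, -32, -16, -8, -3, 0, 3, 8, 16, 32, 56, 96]
# Assumed decision thresholds, halfway between neighboring QVs.
THRESHOLDS = [(a + b) / 2 for a, b in zip(QV, QV[1:])]


def quantize(dif):
    """Return (QL, QV) for a difference value DIF."""
    ql = bisect_right(THRESHOLDS, dif)  # number of thresholds at or below dif
    return ql, QV[ql]


small = quantize(2)     # smooth region: fine resolution
large = quantize(40)    # transition boundary: coarse resolution
clipped = quantize(-200)
```

Here a difference of 2 is reproduced with an error of 1, while a difference of 40 is reproduced with an error of 8, which is the intended trade of resolution between small and large difference values.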

The final data compression technique used 
in the enhanced DPCM CODEC is multilevel 
Huffman coding [4]. The quantized difference 
values are assigned Huffman codes based on 
the probability of occurrence of the quantiza- 
tion level. The most probable quantization 
levels are assigned the shortest length Huff- 
man codes. Multilevel Huffman codes are used 
to take advantage of the fact that neighbor- 
ing pixels fall into the same or close to the 
same quantization level. Fourteen Huffman 
code sets are used - one for each of the 13 
quantization levels and one for start-up pur- 
poses. The Huffman encoder is implemented 
in a look-up table addressed by QL(N) and 
QL(N-1). The output of the Huffman encoder
is a Huffman code up to 12 bits long and a 4-bit
code indicating the length of the Huffman
code.
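
The idea of selecting a code set by the previous quantization level can be sketched in miniature. The example below uses 3 levels instead of 13, and every code table in it is hypothetical; only the structure (one set per previous level plus a start-up set, addressed by QL(N) and QL(N-1), with the code length reported alongside the code) follows the description above.

```python
START = "start"

# Hypothetical miniature: one prefix-code set per previous level, plus a
# start-up set. The level most likely to follow gets the 1-bit code.
CODE_SETS = {
    START: {0: "10", 1: "0", 2: "11"},
    0: {0: "0", 1: "10", 2: "11"},
    1: {0: "10", 1: "0", 2: "11"},
    2: {0: "11", 1: "10", 2: "0"},
}


def encode(levels):
    """Return a list of (code, length) pairs, one per quantization level."""
    prev, out = START, []
    for ql in levels:
        code = CODE_SETS[prev][ql]   # table addressed by QL(N-1) and QL(N)
        out.append((code, len(code)))
        prev = ql
    return out


def decode(bits):
    """Recover the level sequence, switching code sets after each symbol."""
    inverse = {p: {c: q for q, c in t.items()} for p, t in CODE_SETS.items()}
    prev, levels, word = START, [], ""
    for bit in bits:
        word += bit
        if word in inverse[prev]:
            prev = inverse[prev][word]
            levels.append(prev)
            word = ""
    return levels


levels = [1, 1, 1, 2, 2, 0, 0]
encoded = encode(levels)
bitstream = "".join(code for code, _ in encoded)
```

Seven levels are sent in 9 bits here because runs of the same level always use a 1-bit code; a single fixed code set could give the short code to only one level.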

The enhanced DPCM algorithm enjoys an 
advantage over vector quantization in the 
manner in which it handles irregular features. 
Whereas VQ has no good means of repre- 
senting features that appear infrequently, the 
DPCM scheme is able to simply transmit 
longer codewords for unexpected events. The 
effect is a slight rise in the average bpp ratio in 
return for higher resolution and better image 
quality near edges. 

The enhanced DPCM CODEC is able to 
compress a digitized NTSC image from 8 
bits/pixel to an average of 1.8 bits/pixel. The
reconstructed image quality is excellent and is 
indistinguishable from the original image. 

IV Retinal Image Mapping

Due to the proven efficiency of the retina's im- 
age processing, it seems reasonable to match 
artificial image compression to the natural 
compression used by the human vision sys- 
tem. The photoreceptors on the retina form 
a signal proportional to the logarithm of the 
incoming light intensity. The signal is then 
spatially and temporally averaged with neigh- 
boring photoreceptor signals. A difference be- 
tween the photoreceptor output and the aver- 
aged signal is formed, and it is a function of 
this signal on which the brain operates. 
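
The three retinal steps described above (logarithm, local average, difference) can be sketched for a one-dimensional row of photoreceptors. Temporal averaging is omitted, and the uniform averaging window with a radius of one neighbor is an assumption made to keep the example short.

```python
import math


def retina_response(intensities, radius=1):
    """Log-compress, average over neighbors, and return signal minus average."""
    log_signal = [math.log(v) for v in intensities]
    out = []
    for i in range(len(log_signal)):
        lo = max(0, i - radius)
        hi = min(len(log_signal), i + radius + 1)
        neighborhood = log_signal[lo:hi]
        average = sum(neighborhood) / len(neighborhood)
        out.append(log_signal[i] - average)
    return out


# A step edge between a dark and a bright region (hypothetical intensities).
scene = [10.0] * 4 + [100.0] * 4
response = retina_response(scene)
brighter = retina_response([3.0 * v for v in scene])
```

The output is essentially zero in the flat regions and non-zero only next to the edge, and it is unchanged when the whole scene is made three times brighter, since the logarithm turns a common scale factor into an offset that the difference removes.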

The DPCM compression technique is sim- 
ilar to the manner in which actual physio- 
logical systems “encode” visual information. 
The non-linear quantization can be compared 
roughly to the logarithm function, the pre- 
dicted value (PV) corresponds to a spatial av- 
erage, and the difference value is the final out- 
put in both systems. Qualitative performance 
of the two systems is also similar. If the video 
image is shifted slightly, both the DPCM algorithm
and the retinal mapping produce an
output similar to but slightly shifted from that 
of the original image. On the other hand, the 
VQ system places artificial boundaries in the 
image: any shift of the video image could result 
in a very different ordering of the transmitted 
code indices. We will explore lossy methods of
removing redundancy whose losses resemble
those incurred in the retina.

V Channel Error Effects 

In addition to compressing video data, the 
manner in which an algorithm can detect and 
correct bit errors introduced by a noisy com- 
munications channel is another important per- 
formance criterion. Errors must occur at rates 
that are not visible or bothersome to the 
viewer. 

The VQ compression algorithm has perhaps 
its biggest advantage in terms of error perfor- 
mance. Since each block is represented by a 
fixed length codeword, any errors that occur 
are localized to the block in which the error 
occurred: there is no cascading effect. This 
is analogous to the biological system in that 
each fixed length codeword maps to a single optic
nerve fiber. The small amount of information that
is lost in a single block of data or along a single
nerve fiber is so small that processing can
continue uninterrupted.

One drawback to the DPCM algorithm is 
its non-graceful performance degradation due 
to bit errors. Because of its variable-length
nature, Huffman-encoded data is extremely
vulnerable to bit errors. When an error occurs in
the Huffman code data, proper decoding is
unlikely, and the decoder will lose synchronization
with the codeword boundaries. In the case of video
data compression, loss of synchronization
results in poor image quality.
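
The contrast between variable-length and fixed-length codes under a single bit error can be illustrated with a small experiment. Both code tables and the message are hypothetical; the point is only how far one flipped bit propagates.

```python
# Hypothetical prefix (variable-length) and fixed-length codes for four symbols.
CODE = {"a": "0", "b": "10", "c": "110", "d": "111"}
DECODE = {bits: sym for sym, bits in CODE.items()}
FIXED = {"a": "00", "b": "01", "c": "10", "d": "11"}
FIXED_DECODE = {bits: sym for sym, bits in FIXED.items()}


def decode(bits):
    """Decode a prefix-coded bit string symbol by symbol."""
    out, word = [], ""
    for bit in bits:
        word += bit
        if word in DECODE:
            out.append(DECODE[word])
            word = ""
    return out


def decode_fixed(bits):
    """Decode a bit string made of 2-bit codewords."""
    return [FIXED_DECODE[bits[i:i + 2]] for i in range(0, len(bits), 2)]


def flip(bits, position):
    """Return `bits` with the bit at `position` inverted."""
    flipped = "1" if bits[position] == "0" else "0"
    return bits[:position] + flipped + bits[position + 1:]


message = list("abacad")
sent = "".join(CODE[s] for s in message)
received = decode(flip(sent, 0))

sent_fixed = "".join(FIXED[s] for s in message)
received_fixed = decode_fixed(flip(sent_fixed, 0))
```

With the prefix code, one flipped bit changes the number of decoded symbols (five instead of six), so every later pixel on the line would be displaced. With the fixed-length code the same error changes exactly one symbol and nothing else.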

The enhanced DPCM CODEC employs line
and field resynchronization techniques to reduce
the effects of channel errors on the
reconstructed image quality. A unique word is
inserted into the compressed data stream at 
the beginning of each video line and each field 
(a different unique word is used for the lines 
and the fields) by the encoder. The decoder 
detects the occurrence of the unique words and 
also counts the number of Huffman codes re- 
ceived for each line and each field. If the lo- 
cation of the unique words and the beginning 
of the video lines or fields do not occur at the 
same point, then an error has occurred some- 
where in the Huffman code data. 
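
The line-level check can be sketched at the level of decoded symbols. In this simplified model the unique word is represented by a marker token, LINE_SYNC, and the function name and the four-codes-per-line figure are hypothetical; field-level checking and the handling of corrupted unique words are not modelled.

```python
LINE_SYNC = "LINE"  # stands in for the unique word that starts each video line


def lines_in_error(stream, codes_per_line):
    """Return the indices of lines whose Huffman code count is not as expected."""
    bad = []
    line_index = -1
    count = 0
    for token in stream:
        if token == LINE_SYNC:
            # A unique word arrived: it should coincide with the end of a full line.
            if line_index >= 0 and count != codes_per_line:
                bad.append(line_index)
            line_index += 1
            count = 0
        else:
            count += 1
    if line_index >= 0 and count != codes_per_line:
        bad.append(line_index)
    return bad


# Line 1 decoded to only three codes, as happens after a bit error.
stream = [LINE_SYNC, "a", "b", "c", "d",
          LINE_SYNC, "a", "b", "c",
          LINE_SYNC, "a", "b", "c", "d"]
damaged_lines = lines_in_error(stream, codes_per_line=4)
```

Because the count restarts at every unique word, the damage is confined to the flagged line and the following line is decoded normally.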

Control of a decoder FIFO buffer is used to 
maintain synchronization on a line and field 
basis. The FIFO controls work in conjunction 
with the unique words to reduce the impact of 
channel errors to merely streaks in a video line 
rather than a total loss of synchronization. In 
low error rate channels (less than 10^-6), the
streaks caused by the errors may not be visible
or bothersome to a viewer. An error rate of 10^-6
corresponds to approximately one bit error per
frame in NTSC transmissions. However, in a
higher error rate channel (greater than 10^-6)
the image degradation will be unacceptable. A
better form of error detection/correction needs
to be employed when using variable length 
codes like Huffman codes in a noisy commu- 
nication channel. 
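
The figure of roughly one bit error per frame can be checked with a rough calculation from numbers given earlier: the 14.32 MHz sampling rate and the 1.8 bits/pixel average. The nominal frame rate of 30 frames per second is an assumption (NTSC is slightly lower), and blanking intervals and the inserted unique words are ignored.

```python
SAMPLE_RATE_HZ = 14.32e6        # four times the color subcarrier (from the text)
AVERAGE_BITS_PER_PIXEL = 1.8    # compressed rate reported for the CODEC
FRAMES_PER_SECOND = 30          # assumed nominal NTSC frame rate
BIT_ERROR_RATE = 1e-6

bits_per_second = SAMPLE_RATE_HZ * AVERAGE_BITS_PER_PIXEL
bits_per_frame = bits_per_second / FRAMES_PER_SECOND
expected_errors_per_frame = bits_per_frame * BIT_ERROR_RATE
```

Under these assumptions a frame carries about 0.86 Mbit, so a bit error rate of 10^-6 gives slightly less than one expected error per frame, in line with the statement above.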

VI Algorithm Improvements

One approach for reducing the effects of chan- 
nel errors on image quality is to introduce re- 
dundancy in the form of error checking bits. A 
high rate convolutional encoder coupled with 
a maximum likelihood decoder at the receiver 
would give a coding gain of 2-4 dB, depending
on the complexity of the decoder. Such
a system lends itself well to serial transmission
and would reduce error rates in virtually
all cases to acceptable viewing levels; however,
in addition to slightly larger bandwidth
requirements, increased circuit complexity is one
price paid for such a gain. For example,
winner-take-all circuits, which are characteristic of
maximum likelihood estimation, would have to
be included. Also, neural networks have been 
shown to produce good non-linear estimates of 
maximum likelihood sequences [5]. 

Since the VQ algorithm performs so well in 
localizing errors to single blocks, the enhanced
DPCM algorithm could be modified to take 
advantage of this characteristic. If the man- 
ner in which the NAP value was derived could 
be changed to an adaptive form, the difference 
value could (almost) always be driven to one 
of four quantization levels. If this were the 
case, then two bits per pixel could be sent, 
and the transmission would be very robust 
against channel disturbances. Unfortunately,
this method could introduce a further loss of
information since we are limiting the possible
values of DIF to a subset of the actual values that
it can attain. 

Another approach to creating constant word 
length representations of difference values with 
the enhanced DPCM algorithm is to include 
more of the neighboring pixels in the pre- 
diction of the current pixel. In most cases 
this would improve the predicted value’s
estimate of the current pixel. The retina is able
to change the size of the neighborhood
depending on the activity of patterns within a
given pixel’s range. For example, in a field of
constant pixel values, the neighborhood will 
expand to average in more of the surround- 
ings; whereas in areas where there are rapidly 
changing pixel values, the neighborhood is 
shrunk, since values further away are not rep- 
resentative of the pixel of interest. Thus we 
would want to include some mechanism for 
varying the neighborhood size. Harris et al. [6]
accomplish this task through the use of resistive
fuses in their hardware implementation of
the retina. The resistive fuse acts as a resistor
as long as voltage differences between pixels
remain low; if they grow too large, the component
behaves like a blown fuse, effectively cutting off
communication between pixels.
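
The behavior of a resistive fuse network can be sketched with an idealized one-dimensional model. The hard threshold, the unit conductance, and the iterative relaxation below are simplifying assumptions for illustration; they are not the circuit of [6], whose device characteristic is not given here.

```python
def fuse_current(delta, threshold, conductance=1.0):
    """Idealized resistive fuse: ohmic below the threshold, open circuit above it."""
    return conductance * delta if abs(delta) < threshold else 0.0


def smooth(values, threshold, steps=200, rate=0.2):
    """Relax a row of pixel voltages connected to their neighbors by fuses."""
    v = [float(x) for x in values]
    for _ in range(steps):
        nxt = v[:]
        for i in range(len(v)):
            flow = 0.0
            if i > 0:
                flow += fuse_current(v[i - 1] - v[i], threshold)
            if i + 1 < len(v):
                flow += fuse_current(v[i + 1] - v[i], threshold)
            nxt[i] = v[i] + rate * flow
        v = nxt
    return v


# Two noisy flat regions separated by a large step (hypothetical values).
row = [10, 12, 11, 50, 52, 51]
result = smooth(row, threshold=10)
```

Each region settles to its own average (11 and 51) because the fuse between them stays open, so the neighborhood grows within a smooth region but never averages across the edge.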

Another aspect of the retina is its ability 
to remove spatial distortion resulting from the 
changing media through which light signals 
pass. This distortion is analogous to the tem- 
poral smearing of a transmitted digital pulse 
as it passes through a network. Shown in Figure 4
is Lucky’s decision directed transversal
filter [7]. This filter removes temporal smear- 
ing of a digital signal by setting the weights so 
that they unconvolve the dispersive channel 
through which the signal passed. The incom- 
ing signal enters into a tapped analog delay 
line; the tapped values are weighted, summed 
and compared to some threshold value. The 
difference between the weighted sum and the 
binary comparator output forms the error 
value that drives the weight adapting algo- 
rithm. Figure 5 shows a modification to this 
idea so that it implements a spatial filter. 
Rather than being taps on an analog delay 
line, the incoming lines are the outputs from 
similar neighboring pixel filters. The weighted 
sum of neighboring pixels forms a prediction 
of the center pixel. When the prediction
matches the actual value, adaptation of the 
weights will stop. The modified spatial filter 
uses adaptive means to unconvolve any spa- 
tial smearing. Networks similar to this spatial 
filter are also useful for storing vectors in an 
associative memory [8]. 
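
The adaptation loop of the modified spatial filter can be sketched as follows. The weight adapting algorithm is not specified above, so the least-mean-squares (LMS) update used here is an assumption, as are the two-neighbor filter, the step size, and the training samples.

```python
def lms_step(weights, neighbors, center, rate):
    """One adaptation step: predict the center pixel, then correct the weights."""
    prediction = sum(w * x for w, x in zip(weights, neighbors))
    error = center - prediction  # drives the adaptation; zero means a match
    updated = [w + rate * error * x for w, x in zip(weights, neighbors)]
    return updated, error


# Hypothetical training data in which the center pixel is the mean of two neighbors.
samples = [((1.0, 0.0), 0.5), ((0.0, 1.0), 0.5), ((1.0, 1.0), 1.0)]
weights = [0.0, 0.0]
for _ in range(500):
    for neighbors, center in samples:
        weights, error = lms_step(weights, neighbors, center, rate=0.1)
```

The weights settle at 0.5 each, at which point the prediction matches the actual value, the error term vanishes, and the weights stop changing.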


Figure 4: Decision Directed Filter

Figure 5: Modified Spatial Filter

VII Summary 

Digital compression of video images does in- 
deed seem to be a viable means of transmitting 
HDTV signals. We examined two compres- 
sion algorithms and made some comparisons 
to the human vision system. Vector quantization
gives superior error performance
because of the fixed length information
block it sends. The enhanced DPCM algorithm
compresses images more like the retinal
image mapping and produces high quality
reconstructions even near edges and other
irregular features. The goal for HDTV, of course, is
to produce the highest quality picture given 
the allotted bandwidth while still being robust
against channel disturbances. Neural circuitry
can help to group the data output
from the DPCM algorithm into more regular
sized “chunks”, thereby incorporating the
advantages of both techniques into a single
process.